Abstract — We study adaptive network coding (NC) for scheduling
real-time traffic over a single-hop wireless network. To meet
the hard deadlines of real-time traffic, it is critical to strike a 
balance between maximizing the throughput and minimizing the 
risk that the entire block of coded packets may not be decodable 
by the deadline. Thus motivated, we explore adaptive NC, where 
the block size is adapted based on the remaining time to the 
deadline, by casting this sequential block size adaptation problem 
as a finite-horizon Markov decision process. One interesting 
finding is that the optimal block size and its corresponding 
action space monotonically decrease as the deadline approaches, 
and the optimal block size is bounded by the "greedy" block 
size. These unique structures make it possible to narrow down 
the search space of dynamic programming, building on which 
we develop a monotonicity-based backward induction algorithm 
(MBIA) that can solve for the optimal block size in polynomial 
time. Since channel erasure probabilities would be time-varying 
in a mobile network, we further develop a joint real-time 
scheduling and channel learning scheme with adaptive NC that 
can adapt to channel dynamics. We also generalize the analysis to 
multiple flows with hard deadlines and long-term delivery ratio 
constraints, devise a low-complexity online scheduling algorithm 
integrated with the MBIA, and then establish its asymptotic
throughput optimality. In addition to analysis and simulation
results, we perform high-fidelity wireless emulation tests with real
radio transmissions to demonstrate the feasibility of the MBIA 
in finding the optimal block size in real time. 

Index Terms — Network coding, real-time scheduling, wireless 
broadcast, deadlines, delay, throughput, resource allocation 

I. Introduction 

The past few years have witnessed a tremendous growth 
of multimedia applications in wireless systems. To support 
the rapidly growing demand in multimedia traffic, wireless 
systems must meet the stringent quality of service (QoS) re- 
quirements, including the minimum bandwidth and maximum 
delay constraints. However, the time-varying nature of wireless 
channels and the hard delay constraints give rise to great 
challenges in scheduling multimedia traffic flows. In this paper, 
we explore network coding (NC) to optimize the throughput 
of multimedia traffic over wireless channels under the hard 
deadline constraint. 

In capacitated multihop networks, NC is known to optimize 
the multicast flows from a single source to the min-cut 
capacity [1]. NC also provides coding diversity over unreliable 
wireless channels and improves the throughput and delay 
performance of single-hop broadcast systems, compared to 
(re)transmissions of uncoded packets [2]-[8]. Nevertheless,
block NC induces "decoding delay," i.e., receivers may
not decode network-coded packets until a sufficient number of
innovative packets are received. Therefore, the minimization
of NC delay has received much attention (e.g., [9]-[12]).

For multimedia traffic, meeting the deadline may be more 
critical than reducing the average delay. Under the hard 
deadline constraints, NC may result in significant performance 
loss, unless the receivers can decode the packets before the 
deadline. Different NC mechanisms (e.g., [13]-[16]) have
been proposed recently to incorporate deadline constraints. An 
immediately-decodable network coding (IDNC) scheme has 
been proposed in [14] to maximize the broadcast throughput 
subject to deadlines. A partially observable Markov decision 
process (POMDP) framework has been proposed in [15] to op- 
timize media transmissions with erroneous receiver feedback. 
These works focus on optimizing network codes in each 
transmission; however, such an approach is typically not 
tractable due to the "curse of dimensionality" of dynamic 
programming. To reduce the complexity of optimizing network 
codes in each transmission, [16] has formulated a joint coding 
window selection and resource allocation problem to optimize 
the throughput in deadline-constrained flows. However, the
computational complexity can still be overwhelming due to the
finite-horizon dynamic programming involved in the coding
window selection. To overcome this limitation, [16] has proposed
a heuristic scheme with a fixed coding window to trade off
optimality against complexity.

A primary objective of this study is to (i) explore optimal 
adaptive NC schemes with low computational complexity, 
and (ii) integrate channel learning with adaptive NC over 
wireless broadcast erasure channels. Our main contributions 
are summarized as follows. 

• We develop an adaptive NC scheme that sequentially 
adjusts the block size (coding block length) of NC to 
maximize the system throughput, subject to the hard 
deadlines (cf. [16]). We show that the optimal block 
size and its corresponding action space monotonically 
decrease as the packet deadline approaches, and the 
optimal block size is bounded by the "greedy" block size 
that maximizes the immediate throughput only. These 
unique structures make it possible to narrow down the 
search space of dynamic programming, and accordingly 
we develop a monotonicity-based backward induction 
algorithm (MBIA) that can solve for the optimal block
size in polynomial time, compared with [15], [16]. We
also develop a joint real-time scheduling and channel
learning scheme with adaptive NC for the practical case,
in which the scheduler does not have (perfect) channel
information.
• We generalize the study on adaptive NC to the case
with multiple flows. We develop a joint scheduling and
block size adaptation approach to maximize the weighted
system throughput subject to the long-term delivery ratio
and the hard-deadline constraint of each flow. By integrating
the MBIA into the model with multiple flows, we
construct a low-complexity online scheduling algorithm.
This online algorithm is shown to be throughput optimal
in the asymptotic sense as the step size in iterations
approaches zero.
• We implement the adaptive NC schemes in a realistic
wireless emulation environment with real radio transmissions.
Our high-fidelity testbed results corroborate the
feasibility of the MBIA in finding the optimal block
size in real time. As expected, the adaptive NC scheme
with the MBIA outperforms the fixed coding scheme,
and the proposed scheme of joint real-time scheduling
and channel learning performs well under unknown and
dynamic channel conditions.

Fig. 1: System model. (The arrow denotes the time instant for
drops of undelivered packets and arrivals of new packets.)

The rest of the paper is organized as follows. In Section 
II, we introduce the system model and present the block size 
adaptation problem with the hard deadlines. In Section III, we 
develop the MBIA to solve for the optimal block size and 
building on this we devise the joint real-time scheduling and 
channel learning scheme with adaptive NC for the case with 
unknown channel information. In Section IV, we generalize 
the study on adaptive NC to multiple flows. In Section V, 
we implement the adaptive NC schemes and test them in a 
realistic wireless emulation environment with hardware-in-the- 
loop experiments. We conclude the paper in Section VI. 

II. Throughput Maximization vs. Hard Deadline 

A. System Model 

We consider a time-slotted downlink system with one transmitter
(e.g., base station) and N receivers (users), as illustrated
in Fig. 1. Time slots are synchronized across receivers and
the transmission time of a packet corresponds to one time
slot. The transmitter broadcasts M packets to N receivers
over i.i.d. binary erasure channels with erasure probability
e.^1 We assume immediate and perfect feedback available at
the transmitter. For multimedia communications, it is standard
to impose deadlines for delay-sensitive data (see, e.g., [4],
[14]-[17]). We assume that packets must be delivered to each
receiver within T slots, i.e., the deadline of each packet is T
slots. Any packet that cannot be delivered to all receivers by 
this deadline is dropped without contributing to the throughput. 

Worth noting is that this model can be readily applied 
to finite-energy systems with NC, where the objective is to 
maximize the system throughput before the energy is depleted 
for further transmission. Therefore, the energy and delay 
constraints can be used interchangeably. 

In Section III, we consider the basic model with one flow 
and one frame of T slots. In Section IV, we generalize the 
model to multiple frames with multiple flows, where packets 
arrive at the beginning of each frame and they are dropped if 
they cannot be delivered to their receivers by the deadline of 
T slots. 

B. Network Coding for Real-time Scheduling 

As noted above, the throughput gain of NC comes at the 
expense of longer decoding delay (since packets are coded 
and decoded as a block), which may reduce the throughput 
of the system due to the hard deadline constraints. Let K 
denote the block size, i.e., the number of original packets 
encoded together by NC. We assume that the transmitter and 
each receiver know the set of coding coefficients, and the 
transmitter broadcasts the value of K to receivers before the 
NC transmissions start. The coding coefficients can also be 
chosen randomly from a large field size (or from a predeter- 
mined coding coefficient matrix of rank K) such that with 
high probability K packet transmissions deliver K innovative 
packets in coded form to any receiver, i.e., the entire block of 
packets can be decoded after K successful transmissions. As 
shown in [2], the probability that all receivers can decode the
block of size K within T slots is given by

$$P(K,T) = \left( \sum_{\tau=K}^{T} \binom{\tau-1}{K-1} (1-e)^{K} e^{\tau-K} \right)^{N}, \qquad (1)$$

where $\binom{n}{m}$ denotes the number of combinations of size m out
of n elements. Note that (1) strongly depends on the choice
of block size K, and we can show the following.
Lemma 2.1. The decoding probability (1) is monotonically 
decreasing with K for fixed T. 
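
Lemma 2.1 can be checked numerically with a minimal sketch. It assumes that (1) takes the standard negative-binomial form for i.i.d. erasures: each receiver needs K successful receptions within T slots, every received coded packet is innovative, and the N receivers are independent. The function name `decoding_prob` and the parameter values are illustrative choices, not taken from the paper.

```python
from math import comb


def decoding_prob(K, T, e, N):
    """Probability that all N receivers collect K coded packets within T slots.

    Assumes i.i.d. erasures with probability e and that every received
    coded packet is innovative (large field size).
    """
    if K == 0:
        return 1.0  # nothing to decode
    if T < K:
        return 0.0  # fewer slots than packets: decoding is impossible
    # Per-receiver probability that the K-th success occurs in some slot tau <= T.
    per_receiver = sum(
        comb(tau - 1, K - 1) * (1 - e) ** K * e ** (tau - K)
        for tau in range(K, T + 1)
    )
    return per_receiver ** N


if __name__ == "__main__":
    T, e, N = 10, 0.2, 5  # illustrative values
    probs = [decoding_prob(K, T, e, N) for K in range(1, T + 1)]
    for K, p in enumerate(probs, start=1):
        print(f"K={K:2d}  P(K,T)={p:.4f}")
    # Lemma 2.1: for fixed T the sequence should be non-increasing in K.
    print(all(a >= b for a, b in zip(probs, probs[1:])))
```

The per-receiver sum is the probability that the K-th successful reception happens no later than slot T; raising it to the power N reflects the requirement that every receiver decodes.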

With block NC, there is the risk that none of the packets can
be decoded by the receivers before the hard deadline. By using
IDNC, it may be possible to start decoding without waiting
for the entire block to arrive, but the complexity of finding
a suitable code may be overwhelming due to the dynamic
programming involved in the problem [14]. Here, we provide
the throughput guarantees for the worst-case scenario, where
either the whole block or none of the packets can be decoded
at any slot. There is a tradeoff between the block size and the
risk of decoding failure. In particular, we cannot greedily increase K
to maximize the system throughput under the hard deadline
constraints, since the risk that some receivers cannot decode
the packets, i.e., 1 - P(K,T), also increases with K according
to Lemma 2.1.

^1 The results derived in the paper can be readily applied to heterogeneous
channels with different erasure probabilities.

^2 We can also employ random NC with a finite field size q. This would
change the decoding probability (1) to a function of q. However, the general
structure of the results will remain the same.

If the first block is delivered within the deadline, i.e., T 
slots, the size of a new block (with new packets) needs to be re- 
adjusted for the remaining slots. In other words, we need real- 
time scheduling of network-coded transmissions depending on 
how close the deadline is. For example, when there is only one 
slot left before the deadline, the optimal block size is 1, since
for any K > 1, no receivers can decode the packets before 
the deadline. Also, the block size in a given slot statistically 
determines the remaining slots (before the deadline) along with 
the future system throughput. In Section III, we derive the 
optimal block size adaptation policy to maximize the system 
throughput under the deadline constraints for one frame with 
one flow. In Section IV, we generalize the results to the case 
with multiple frames with multiple flows. 

C. Problem Formulation: A Markov Decision Process View 

The NC-based multimedia traffic scheduling of one frame 
is a sequential decision problem, which can be formulated as 
a Markov decision process (MDP) described as follows. 

Horizon: The number of slots available before the deadline 
over which the transmitter (scheduler) decides the block size 
is the horizon. Due to the hard deadline, this MDP problem 
is a finite horizon problem with T slots (one frame). 

State: The number of remaining slots $t \in \{0, 1, \ldots, T\}$ before the hard
deadline is defined as the state,^3 where t = 0 denotes that
there is no slot left for transmissions.

Action: Let $K_t$, $t \in \{1, \ldots, T\}$, denote the action taken at
state t, which is the block size for the remaining t slots. Let
$M_t$ denote the number of packets undelivered at state t. Thus,
at state t > 0, $K_t$ can be chosen from 1 to $\min(t, M_t)$.
For t = 0, the transmitter stops transmitting any packet,
i.e., $K_0 = 0$. In general, the action space is defined as
$\mathcal{K}_t = \{0, 1, \ldots, \min(t, M_t)\}$.

Expected immediate reward: For the remaining t slots, the 
expected immediate reward is the expected number of packets 
successfully decoded by all receivers, which is given by 



$$R_t(K_t) = K_t \, P(K_t, t), \qquad (2)$$

where $P(K_t, t)$ is given by (1), denoting the probability that
each receiver can decode these $K_t$ packets within t slots.

Block size adaptation policy: A block size adaptation policy
$\mathcal{P}$ is a sequence of mappings, $\mathcal{P} = \{\mathcal{P}_t\}_{t=1}^{T}$, from t, $M_t$,
e, and N to an action $K_t \in \{0, 1, \ldots, \min(t, M_t)\}$, i.e.,
$K_t = \mathcal{P}_t(t, M_t, e, N) = \min(\mathcal{P}_t(t, e, N), M_t)$. Without loss
of generality, in Section III, we assume that $M_t$ is always
larger than t, i.e., $K_t \in \{0, 1, \ldots, t\}$. This does not change the
monotonicity structure of the block size with state t. We will
discuss these structural properties in detail in Section III.

^3 We use the terms "state" and "slot" interchangeably.

Total expected reward: Given the adaptation policy $\mathcal{P}$, the
total expected reward for the remaining t slots is given by

$$V_t(K_t; \mathcal{P}) = R_t(K_t) + \mathbb{E}[V_j(K_j; \mathcal{P})] = R_t(K_t) + \sum_{j=0}^{t-K_t} q_t(j) \, V_j(K_j; \mathcal{P}), \qquad (3)$$

where the probability mass function $q_t(j) = P(K_t, t-j) - P(K_t, t-j-1)$
denotes the probability that the block of size $K_t$ is delivered
in exactly t - j slots, leaving j slots before the deadline.
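
The recursion (3) can be evaluated for any fixed policy. The sketch below is illustrative only: it assumes the negative-binomial form of (1) for i.i.d. erasures, and the names `completion_pmf` and `evaluate_policy` are hypothetical helpers rather than the authors' code.

```python
from math import comb


def decoding_prob(K, T, e, N):
    """Assumed form of (1): all N receivers get K packets within T slots."""
    if K == 0:
        return 1.0
    if T < K:
        return 0.0
    per_receiver = sum(
        comb(tau - 1, K - 1) * (1 - e) ** K * e ** (tau - K)
        for tau in range(K, T + 1)
    )
    return per_receiver ** N


def completion_pmf(K, t, e, N):
    """q_t(j) for j = 0..t-K: the block completes with exactly j slots left."""
    return [
        decoding_prob(K, t - j, e, N) - decoding_prob(K, t - j - 1, e, N)
        for j in range(t - K + 1)
    ]


def evaluate_policy(policy, T, e, N):
    """Total expected reward V[t] of a fixed policy t -> K_t (1 <= K_t <= t)."""
    V = [0.0] * (T + 1)  # V[0] = 0 is the terminal reward
    for t in range(1, T + 1):
        K = policy(t)
        q = completion_pmf(K, t, e, N)
        immediate = K * decoding_prob(K, t, e, N)
        V[t] = immediate + sum(q[j] * V[j] for j in range(len(q)))
    return V


if __name__ == "__main__":
    # Plain retransmission (K_t = 1) with a single receiver.
    print(evaluate_policy(lambda t: 1, T=6, e=0.3, N=1))
```

As a sanity check, plain retransmission to a single receiver delivers one packet per successful slot, so the value with t slots left should equal t(1 - e).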

III. Network Coding with Adaptive Block Size 

A main contribution of this paper is the development and 
analysis of the polynomial-time monotonicity-based backward 
induction algorithm (MBIA). The design of the MBIA is 
motivated by the structures of the optimal and the greedy 
policies that are formally defined as follows. 

Definition 3.1. A real-time scheduling policy with adaptive 
network coding is optimal, if and only if it achieves the 
maximum value of the total expected reward given by the 
Bellman equation [18] in dynamic programming: 

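$$V_t(K_t^*; \mathcal{P}^*) = \max_{K_t \in \mathcal{K}_t} \Big\{ R_t(K_t) + \sum_{j=0}^{t-K_t} q_t(j) \, V_j(K_j^*; \mathcal{P}^*) \Big\}, \qquad (4)$$

where $K_t^*$ denotes the optimal block size, $\mathcal{P}^*$ denotes the
optimal block size adaptation policy, and the terminal reward
is given by $V_0(0; \mathcal{P}^*) = 0$.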

Definition 3.2. The greedy policy maximizes only the expected 
immediate reward (2) without considering the future rewards 
and the greedy decision is given by 



$$\hat{K}_t = \arg\max_{K_t \in \{0, 1, \ldots, t\}} R_t(K_t). \qquad (5)$$
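
The greedy decision in (5) can be illustrated with a short sketch. It assumes the negative-binomial form of (1) for i.i.d. erasures; `immediate_reward` and `greedy_block_size` are hypothetical names, and the exhaustive scan over K is chosen for clarity rather than speed.

```python
from math import comb


def decoding_prob(K, T, e, N):
    """Assumed form of (1): all N receivers get K packets within T slots."""
    if K == 0:
        return 1.0
    if T < K:
        return 0.0
    per_receiver = sum(
        comb(tau - 1, K - 1) * (1 - e) ** K * e ** (tau - K)
        for tau in range(K, T + 1)
    )
    return per_receiver ** N


def immediate_reward(K, t, e, N):
    """Expected immediate reward (2): block size times decoding probability."""
    return K * decoding_prob(K, t, e, N)


def greedy_block_size(t, e, N):
    """Block size in {0, ..., t} that maximizes the immediate reward only."""
    return max(range(0, t + 1), key=lambda K: immediate_reward(K, t, e, N))


if __name__ == "__main__":
    for t in range(1, 11):
        print(t, greedy_block_size(t, e=0.2, N=5))
```

With e = 0.2 and N = 5, for example, two packets coded together only become worthwhile under the greedy rule once three slots remain.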



A. Optimal Block Size Adaptation Policy 

In each slot t, the optimal policy balances the immediate
reward and the future reward by selecting a suitable block size
$K_t^*$. In general, the approach of solving for the optimal block
size by traditional dynamic programming [18] suffers from the 
"curse of dimensionality," where the complexity of computing 
the optimal strategy grows exponentially with t. However, the 
optimal block size and its corresponding action space exhibit 
the monotonicity structures, and the optimal block size is 
bounded by the greedy block size. These unique structures 
make it possible to narrow down the search space of dynamic 
programming, and accordingly we develop a monotonicity- 
based backward induction algorithm (MBIA) with polynomial 
time complexity. 

The MBIA searches for the optimal block size by backward
induction and provides the optimal block size for each system
state. Depending on the remaining time to deadline, the
scheduler transmits coded packets with the optimal block size
until each user receives enough packets to decode this block.
Then, the scheduler adjusts the block size based on the current
state, and proceeds with the new block transmission. This
continues until the packet deadline expires or all packets are
delivered. We present next the structural properties of the block
size adaptation problem that will lead to the formal definition
of the MBIA.

Fig. 2: The unimodal property of $R_t(K_t)$.

Lemma 3.1. The action space $\mathcal{K}_t$ monotonically shrinks as t
decreases.

Proof outline: As the number of remaining slots t decreases,
the maximum possible block size decreases as well, since
$K_t \in \{0, 1, \ldots, t\}$; otherwise no receiver can decode the block
of coded packets. □

Proposition 3.1. The expected immediate reward function
$R_t(K_t)$ has the following properties:

1) $R_t(K_t)$ is unimodal for $K_t \in \{0, 1, \ldots, t\}$.^4

2) $\hat{K}_t$ in (5) monotonically decreases as t decreases.

Proof outline: To show the unimodal property, it suffices
to show that $R_t(K_t)$ is log-concave, which can be shown by
induction. The monotonicity property of $\hat{K}_t$ can
be shown by contradiction, using
$\lim_{t \to \infty} R_t(K_t) = K_t$. □

Fig. 2 shows the possible curves of $R_t(K_t)$ for different
values of t, illustrating the unimodal property of $R_t(K_t)$
formally stated in Proposition 3.1. Based on Proposition 3.1,
the monotonicity property of the optimal block size $K_t^*$ is
given by the following theorem.

Theorem 3.1. The optimal block size $K_t^*$ monotonically
decreases as t decreases, i.e., $K_t^* \ge K_{t-1}^*$ for any t.

Proof outline: Based on Proposition 3.1, we can show that
if $K_t^* < K_{t-1}^*$, then $V_{t-1}(K_t^*; \mathcal{P}^*) > V_{t-1}(K_{t-1}^*; \mathcal{P}^*)$, which
contradicts that $K_{t-1}^*$ is the optimal action in slot t - 1. □

^4 f(x) is a unimodal function if for some m, f(x) is monotonically
increasing for x < m and monotonically decreasing for x > m. The maximum
value is attained at x = m and there are no other local maximum points.

As t decreases, the risk that some receivers cannot decode 
the given block of packets increases for a fixed block size. 
Therefore, the scheduler becomes more conservative in the 
block size adaptation and selects a smaller block size. 

Remarks: 1) For t = 1, 2, the optimal block size is $K_t^* = 1$,
which can be obtained by computing the Bellman equation (4).
2) When N = 1, the optimal block size is $K_t^* = 1$ for all t,
since the plain retransmission policy with $K_t = 1$ is better
than the block coding with $K_t > 1$ in the presence of the
hard deadlines.



Theorem 3.2. The optimal block size $K_t^*$ is not greater than
the greedy block size $\hat{K}_t$ for any t.

Proof outline: Based on Proposition 3.1, we can show that if
$K_t^* > \hat{K}_t$, along any sample path, the system throughput
by taking the action $\hat{K}_t$ is at least as high as that by taking
the action $K_t^*$, which contradicts that $K_t^*$ in this case is the
optimal action in slot t. □



Corollary 3.1. At state t, if $R_t(K_t) > R_t(K_t + 1)$, then
$K_j^* \le \hat{K}_j \le K_t$ for any $j \in \{1, \ldots, t\}$.

Corollary 3.1 follows directly from Proposition 3.1 and 
Theorem 3.2. Based on these structural properties, we develop 
the MBIA, which is presented in Algorithm 1. 

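Algorithm 1 Monotonicity-based Backward Induction Algorithm (MBIA)

1) Set t = 0 and $V_0(0; \mathcal{P}^*) = 0$.

2) Substitute t + 1 for t, and compute $V_t(K_t^*; \mathcal{P}^*)$ by
searching $K_t \in \mathcal{K}_t$, where $\mathcal{K}_t = \{K_{t-1}^*, K_{t-1}^* + 1, \ldots, \hat{K}_t\}$, i.e.,

$$V_t(K_t^*; \mathcal{P}^*) = \max_{K_t \in \mathcal{K}_t} \Big\{ R_t(K_t) + \sum_{j=0}^{t-K_t} q_t(j) \, V_j(K_j^*; \mathcal{P}^*) \Big\},$$

and

$$K_t^* = \arg\max_{K_t \in \mathcal{K}_t} \Big\{ R_t(K_t) + \sum_{j=0}^{t-K_t} q_t(j) \, V_j(K_j^*; \mathcal{P}^*) \Big\}.$$

3) If t = T, stop; otherwise go to step 2.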
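
The steps of Algorithm 1 can be sketched in code as follows. This is a minimal illustration, not the authors' implementation. It assumes the negative-binomial form of (1) for i.i.d. erasures and at least t undelivered packets at every state, and it finds the greedy block size by a plain linear scan instead of the Fibonacci search mentioned in the complexity proof. The function names and the `use_monotonicity` switch, which falls back to an unrestricted search for comparison, are hypothetical.

```python
from math import comb


def decoding_prob(K, T, e, N):
    """Assumed form of (1): all N receivers get K packets within T slots."""
    if K == 0:
        return 1.0
    if T < K:
        return 0.0
    per_receiver = sum(
        comb(tau - 1, K - 1) * (1 - e) ** K * e ** (tau - K)
        for tau in range(K, T + 1)
    )
    return per_receiver ** N


def value_of_action(K, t, V, e, N):
    """Immediate reward plus expected future reward, as in the Bellman equation (4)."""
    reward = K * decoding_prob(K, t, e, N)
    future = sum(
        (decoding_prob(K, t - j, e, N) - decoding_prob(K, t - j - 1, e, N)) * V[j]
        for j in range(t - K + 1)
    )
    return reward + future


def backward_induction(T, e, N, use_monotonicity=True):
    """Return (K_opt, V), indexed by the number of remaining slots t = 0..T."""
    V = [0.0] * (T + 1)      # V[0] = 0: terminal reward
    K_opt = [0] * (T + 1)    # K_opt[0] = 0: nothing is sent with no slots left
    for t in range(1, T + 1):
        if use_monotonicity:
            # Search only between the previous optimum and the greedy block size.
            greedy = max(range(1, t + 1),
                         key=lambda K: K * decoding_prob(K, t, e, N))
            lo = max(1, K_opt[t - 1])
            candidates = range(lo, max(lo, greedy) + 1)
        else:
            candidates = range(1, t + 1)  # unrestricted search for comparison
        K_opt[t] = max(candidates, key=lambda K: value_of_action(K, t, V, e, N))
        V[t] = value_of_action(K_opt[t], t, V, e, N)
    return K_opt, V


if __name__ == "__main__":
    K_opt, V = backward_induction(T=10, e=0.2, N=5)
    print(K_opt[1:])
    print(round(V[10], 4))
```

Restricting the candidates to the interval between the previous optimum and the greedy block size is what shrinks the search; when Theorems 3.1 and 3.2 hold, the restricted and unrestricted searches return the same value.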

The MBIA confines the search space at state t to the
interval from $K_{t-1}^*$ (the optimal policy at state t - 1) to $\hat{K}_t$
(the greedy policy at state t). Thus, the MBIA reduces the
search space over time and reduces the complexity of dynamic 
programming as given by the following theorem. 

Theorem 3.3. The MBIA is a polynomial-time algorithm and
the complexity is upper bounded by $O(T^2)$.

Proof: Based on Proposition 3.1, $R_t(K_t)$ has the unimodal
property and therefore $\hat{K}_t$ can be solved efficiently by the
Fibonacci search algorithm [19], which is a sequential line
search algorithm with a complexity of $O(\log t)$ at state t.
Therefore, in each iteration, it takes $O(\log t + \hat{K}_t - K_{t-1}^*)$
steps to find $K_t^*$. Based on Lemma 3.1, $\hat{K}_t - K_{t-1}^*$ is
upper bounded by t. After some algebra, we show that the
complexity of Algorithm 1 is bounded by $O(T^2)$ and Theorem
3.3 follows. □
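
The algebra in the proof can be sketched as a sum of the per-state costs over all T states. This is a reconstruction under the per-state bound stated in the proof, not the authors' own derivation:

```latex
\sum_{t=1}^{T} O\big(\log t + \hat{K}_t - K_{t-1}^*\big)
\;\le\; \sum_{t=1}^{T} O(\log t + t)
\;=\; O\Big(T \log T + \tfrac{T(T+1)}{2}\Big)
\;=\; O(T^2).
```

The quadratic term comes from the search interval, whose length is bounded by t at state t; the logarithmic cost of locating the greedy block size is dominated by it.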




Fig. 3: The monotonicity property of e*. 



Remarks: By using the MBIA, the optimal block size can 
be computed in polynomial time, which is a desirable property 
for online implementation. The optimal block size depends on 
the number of receivers and channel erasure probabilities. For 
different flows, the set of receivers may be different, which 
may result in different optimal block sizes, even when the 
number of remaining slots is the same across these flows. 
Therefore, without using the MBIA, offline schemes would 
need to compute the optimal policies for all possible receiver 
sets; however, this would be a computationally demanding 
task, as the number of receivers increases. 

Based on the monotonicity properties of the greedy and 
optimal block sizes, the optimal policy becomes the plain 
retransmission, if the channel erasure probability is sufficiently 
large. This sufficiency condition for $K_t^* = 1$ at slot t is
formally given as follows.

Theorem 3.4. At slot t, the optimal policy switches to the
plain retransmission policy, i.e., $K_t^* = 1$, when the erasure
probability satisfies the threshold condition

$$e > e^*(t, N), \qquad (6)$$

where $e^*(t, N) \in (0, 1)$ is the non-trivial (unique) solution to
$R_t(1) = R_t(2)$.

Proof outline: The proof follows directly by comparing
$R_t(1)$ and $R_t(2)$ that are expressed as a function of e. □
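
The threshold in Theorem 3.4 has no closed form in general, but it can be located numerically. The sketch below assumes the negative-binomial form of (1) and uses bisection on the difference between the immediate rewards of block sizes 2 and 1, relying on the uniqueness of the non-trivial root stated in the theorem. The function name `erasure_threshold` and the tolerance are illustrative.

```python
from math import comb


def decoding_prob(K, T, e, N):
    """Assumed form of (1): all N receivers get K packets within T slots."""
    if K == 0:
        return 1.0
    if T < K:
        return 0.0
    per_receiver = sum(
        comb(tau - 1, K - 1) * (1 - e) ** K * e ** (tau - K)
        for tau in range(K, T + 1)
    )
    return per_receiver ** N


def erasure_threshold(t, N, tol=1e-10):
    """Erasure probability in (0, 1) at which R_t(1) = R_t(2); needs t >= 2."""
    if t < 2:
        raise ValueError("a block of size 2 needs at least 2 slots")

    def gap(e):
        # R_t(2) - R_t(1): positive when coding two packets pays off immediately.
        return 2 * decoding_prob(2, t, e, N) - decoding_prob(1, t, e, N)

    lo, hi = 1e-9, 1 - 1e-9  # gap > 0 near 0 and gap < 0 near 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    for t in (2, 3, 5, 10):
        print(t, [round(erasure_threshold(t, N), 4) for N in (1, 2, 5, 10)])
```

For t = 2 the equation reduces to 2(1 - e)^N = (1 + e)^N, which gives a closed-form value to check the bisection against.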

Note that (6) is a sufficient condition only and indicates the 
optimality of the greedy policy when e is large enough. 

Fig. 3 depicts how the threshold e* varies with t and N. 
The underlying monotonicity property is formally stated in 
Corollary 3.2. 

Corollary 3.2. The threshold e*(t,N) increases monotoni- 
cally with t and decreases monotonically with N. 

Remarks: 1) When the channel is good enough (with
e < e*), NC with $K_t > 1$ can always improve the throughput
compared to the plain retransmission policy. 2) As t increases
(i.e., the deadline becomes looser), the risk of failing to decode
network-coded packets decreases, i.e., e*(t,N) increases. 3)
As N increases, it becomes more difficult to meet the deadline
for each of N receivers and therefore e*(t,N) drops accordingly.

B. Robustness vs. Throughput 

The real-time scheduling policies presented so far focus 
on the expected throughput without considering the variation 
from the average performance. Therefore, it is possible that the 
instantaneous throughput drops far below the expected value. 
To reduce this risk, we use additional variation constraints to 
guarantee that the throughput performance remains close to 
the average. In particular, for each slot t, we introduce the 
variation constraint to the block size adaptation problem as 
follows: 

$$v_t(K_t) \le \sigma_t, \quad \forall K_t \in \mathcal{K}_t, \qquad (7)$$

where $\sigma_t$ is the maximum variation allowed in slot t and the
performance variation $v_t(K_t)$ under action $K_t$ is given by

$$v_t(K_t) = \sum_{i=K_t}^{t} i^2 \big( P(K_t, i) - P(K_t, i-1) \big). \qquad (8)$$

Since $v_t(K_t)$ increases with $K_t$, (7) can be rewritten as the
maximum block size constraint for each slot t, i.e.,

$$K_t \le K_t^{\max}, \qquad (9)$$

where $K_t^{\max} = \max\{K_t \mid K_t = \lfloor v_t^{-1}(\sigma_t) \rfloor\}$, $v_t^{-1}(\cdot)$ is the
inverse mapping of $v_t(\cdot)$, and $\lfloor x \rfloor$ denotes the largest integer
not greater than x. The variation constraints do not change the
monotonicity property of the optimal block size provided
by Theorem 3.1. By introducing the variation constraints
(9), the scheduler becomes more conservative in the block
size adaptation. The additional bound $K_t^{\max}$ can be easily
incorporated into the MBIA by changing the action space to
$\mathcal{K}_t = \{K_{t-1}^*, K_{t-1}^* + 1, \ldots, \min(K_t^{\max}, \hat{K}_t)\}$ at state t.

C. Block Size Adaptation under Unknown Channels 

So far we have discussed the real-time scheduling policies 
with adaptive NC, where the channel erasure probability e is 
perfectly known to the scheduler. The throughput performance 
of these policies depends on e; therefore, the scheduler needs 
to learn e while adapting the block size, when it does not 
have (perfect) channel knowledge. Let $\hat{e}_t$ denote the estimate
of the channel erasure probability in slot t. The scheduler can
update $\hat{e}_t$ based on the feedback from the receivers. In slot
T, if $\hat{e}_T < e$, we would expect with high probability that a
block of packets with the size that is calculated with respect
to $\hat{e}_T$ cannot be delivered before the deadline. Therefore, it is
better to select the block size conservatively at the beginning,
when the estimate $\hat{e}_t$ cannot be highly accurate yet, because
of the small number of samples. As $\hat{e}_t$ improves over time, the
block size can be gradually increased to improve the system 
throughput. Once the estimate is close enough to the actual 
value of e after enough samples are collected, the block size 
should be adjusted (and reduced over time) according to the 
MBIA. 



Clearly, there is a tradeoff between the channel learning and 
the block size adaptation. Here, we formulate a joint real-time 
scheduling and channel learning algorithm (Algorithm 2) to 
adapt the block size while updating the maximum likelihood 
estimate $\hat{e}_t$ of channel erasure probability. In slot t, based
on the feedback, the scheduler can compute the packet loss
ratio, $e_t = 1 - n_t / N$, where $n_t$ denotes the number of receivers
that successfully receive a packet in slot t. Accordingly, the
estimated channel erasure probability $\hat{e}_t$ is given by the moving
average

$$\hat{e}_t = \frac{(T - t) \, \hat{e}_{t+1} + e_t}{T - t + 1}. \qquad (10)$$

The scheduler decides on the block size by comparing the
temporal variation $|\hat{e}_t - \hat{e}_{t+1}|$ with a threshold $\delta$. A detailed
description is given in Algorithm 2.

Algorithm 2 Joint Real-time Scheduling and Channel Learning
with Adaptive Network Coding

Initialization: Choose threshold $\delta$ and set $K_t = 1$.
Repeat until t = 0.
    Update channel estimate $\hat{e}_t$ by (10).
    Compute block size $K_t^*$ by Algorithm 1 with $\hat{e}_t$.
    If $|\hat{e}_t - \hat{e}_{t+1}| > \delta$ then
        If $K_t^* > K_{t+1} + 1$ then
            $K_t = K_{t+1} + 1$,
        Else
            $K_t = K_{t+1}$.
        Endif
    Else
        $K_t = K_t^*$.
    Endif

Remarks: Algorithm 2 captures the tradeoff between the
channel learning and block size adaptation. There are two options
for the scheduler depending on the relationship between
$|\hat{e}_t - \hat{e}_{t+1}|$ and $\delta$. If the channel estimation is not yet good
enough, Algorithm 2 chooses the block size conservatively
by incrementing $K_t$ by at most 1. Otherwise, Algorithm 2
computes the block size by applying the MBIA.
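
The two decision branches of Algorithm 2 can be sketched as a pair of small functions. This is an illustrative sketch under stated assumptions: the estimate update is taken to be a running sample mean of the per-slot loss ratios, and the block size recommended by the MBIA is passed in as `k_star` rather than recomputed. All function and parameter names are hypothetical.

```python
def update_estimate(prev_estimate, loss_ratio, num_prev_samples):
    """Running mean of the per-slot packet loss ratios (assumed form of the moving average)."""
    return (num_prev_samples * prev_estimate + loss_ratio) / (num_prev_samples + 1)


def next_block_size(k_star, k_prev, est_now, est_prev, delta):
    """One pass of the If/Else logic in Algorithm 2.

    k_star   : block size recommended by the MBIA for the current estimate
    k_prev   : block size used in the previous slot
    est_now  : current erasure estimate
    est_prev : previous erasure estimate
    delta    : threshold on the change between consecutive estimates
    """
    if abs(est_now - est_prev) > delta:
        # Estimate still moving: grow the block size by at most one packet.
        return k_prev + 1 if k_star > k_prev + 1 else k_prev
    # Estimate has settled: trust the MBIA recommendation.
    return k_star


if __name__ == "__main__":
    estimate, block = 0.0, 1
    for n, loss in enumerate([0.4, 0.2, 0.3, 0.3, 0.3]):
        new_estimate = update_estimate(estimate, loss, n)
        block = next_block_size(k_star=4, k_prev=block,
                                est_now=new_estimate, est_prev=estimate, delta=0.05)
        estimate = new_estimate
        print(round(estimate, 3), block)
```

Capping the growth at one packet per slot keeps the scheduler from committing to a large block while the erasure estimate is still based on few samples.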

D. Performance Evaluation 

Fig. 4 illustrates for N = 5 the monotonicity structure of
the optimal block size (Theorem 3.1) and verifies that
$K_t^* \le \hat{K}_t$ (Theorem 3.2). Both the optimal and greedy block sizes
increase when the channel conditions improve (from e = 0.5 to 
e = 0.2). Next, we evaluate the performance (average system 
throughput) of different policies. For comparison purposes, we 
also consider a soft delay-based conservative policy, where 
the scheduler chooses the largest block size with the expected 
completion time less than or equal to the number of remaining 
slots. The expected completion time is studied in [2], and it 
is given by 



$$S(K) = K + \sum_{t=K}^{\infty} \big( 1 - P(K, t) \big). \qquad (11)$$

Fig. 5 compares the performance of the optimal, greedy,
conservative and plain retransmission policies for N = 10 and
T = 10. The plain retransmission policy always performs the
worst, whereas the conservative policy performs worse than
the greedy policy. However, as e increases, all policies select
smaller block sizes and their performance gap diminishes.

Fig. 5: Performance (average system throughput) comparison
of different policies.
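
The conservative baseline used in this comparison can be sketched as follows. The code assumes the negative-binomial form of (1), evaluates the expected completion time as K plus the sum of the tail probabilities 1 - P(K, t) for t ≥ K, and truncates that infinite sum once the tail drops below a tolerance. The fallback to a block size of 1 when no block size meets the criterion is an assumption made here for illustration, and the function names are hypothetical.

```python
from math import comb


def decoding_prob(K, T, e, N):
    """Assumed form of (1): all N receivers get K packets within T slots."""
    if K == 0:
        return 1.0
    if T < K:
        return 0.0
    per_receiver = sum(
        comb(tau - 1, K - 1) * (1 - e) ** K * e ** (tau - K)
        for tau in range(K, T + 1)
    )
    return per_receiver ** N


def expected_completion_time(K, e, N, tol=1e-12, max_slots=100000):
    """K plus the sum over t >= K of the probability that decoding is not done by slot t."""
    total = float(K)
    t = K
    while t < max_slots:
        tail = 1.0 - decoding_prob(K, t, e, N)
        if tail < tol:
            break  # remaining terms are negligible
        total += tail
        t += 1
    return total


def conservative_block_size(t, e, N):
    """Largest K whose expected completion time fits in the t remaining slots."""
    feasible = [K for K in range(1, t + 1)
                if expected_completion_time(K, e, N) <= t]
    return max(feasible) if feasible else 1  # fallback is an assumption


if __name__ == "__main__":
    for t in (2, 5, 10):
        print(t, conservative_block_size(t, e=0.2, N=5))
```

With a single receiver the completion time is negative-binomial with mean K / (1 - e), which gives a simple check of the truncated sum.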

Fig. 6 shows the tradeoff between the average system 
throughput and the throughput variation. When the channels 
are good (e.g., e = 0.1 in Fig. 6), the variation constraint (7) 
makes the scheduler choose a small block size, which reduces 
the average system throughput accordingly. However, there is 
no significant effect of (7) when channels are bad (e.g., e = 0.5 
in Fig. 6), since the scheduler already chooses a small block 
size for large e. Fig. 7 evaluates the performance of Algorithm
2 under channel uncertainty and shows that Algorithm 2 is
robust with respect to the variation of $\delta$ and achieves a reliable
throughput performance close to the case with perfect channel
information.

Fig. 6: Average system throughput vs. throughput variation,
where N = 10 and T = 20.

IV. Joint Scheduling and Block Size Optimization

In this section, we generalize the study on adaptive NC
to the case of multiple frames, where the scheduler serves
a set $\mathcal{F}$ of flows subject to the hard deadline and the long-term
delivery ratio constraints. The packets of each flow f
arrive at the beginning of every frame and they are dropped
if they cannot be delivered to its receivers $\mathcal{N}_f$ within this
frame (see Fig. 1). We impose that the loss probability for flow
f due to deadline expiration must be no more than $1 - q_f$,
where $q_f$ is the delivery ratio requirement of flow f. For a
given frame, the vector $a = (a_f)_{f \in \mathcal{F}}$ denotes the number of
packet arrivals at each flow, where $a_f$ is the number of packets
generated by flow f. We assume that $a_f$ is i.i.d. across frames
with finite mean $\lambda_f$ and variance.^5 For ease of exposition,
we assume perfect channel information at the scheduler and
consider coding within each flow but not across different flows.

A. Multi-Flow Scheduling 

The scheduler allocates slots for each flow and uses the optimal
real-time scheduling policy with adaptive NC developed
in Section III to transmit network-coded packets. Given the
arrivals, the scheduler needs to allocate a suitable number of
slots for each flow to satisfy the delivery ratio requirement.
This resource allocation is defined as a feasible schedule,
$s = (s_f)_{f \in \mathcal{F}}$, where $s_f$ denotes the number of slots allocated
to flow f and $\sum_{f \in \mathcal{F}} s_f \le T$. Our goal is to maximize the
weighted throughput subject to the delivery ratio and hard
deadline constraints. We find the optimal schedule, i.e., the
probability $\Pr(s|a)$ that given the arrivals a, the schedule
$s \in \mathcal{S}$ is used from the set $\mathcal{S}$ of all feasible schedules. Then,
the expected service rate $\mu_f$ for flow f is upper-bounded by

$$\mu_f \le \sum_{s,a} c_f(s_f) \Pr(s|a) \Pr(a), \qquad (12)$$

where $c_f(s_f)$ is the expected number of packets that can
be delivered under schedule $s_f$, which is a constant and
can be solved by the MBIA. Hence, we formulate the joint
resource allocation and block size adaptation as the following
optimization problem:

$$\begin{aligned}
\text{maximize} \quad & \sum_{f \in \mathcal{F}} w_f \mu_f \\
\text{subject to} \quad & \mu_f \ge \lambda_f q_f, \quad \forall f \in \mathcal{F}, \\
& \mu_f \le \sum_{s,a} c_f(s_f) \Pr(s|a) \Pr(a), \quad \forall f \in \mathcal{F}, \\
& \Pr(s|a) \ge 0, \ \forall s \in \mathcal{S}, \quad \sum_{s \in \mathcal{S}} \Pr(s|a) \le 1, \ \forall a, \\
\text{variables} \quad & \{\mu_f, \Pr(s|a)\},
\end{aligned} \qquad (13)$$

where $w_f$ is the weight for flow f and can be used as a fairness
metric for resource allocation to each flow.^6 Note that (13)
generalizes the problem studied in [21] by using adaptive NC
schemes in packet transmissions.

^5 The algorithm developed for the i.i.d. case can be readily applied to non
i.i.d. scenarios. The analysis and performance guarantees can be obtained
using the delayed Lyapunov drift techniques developed in [22], [23].

B. Dual Decomposition 

Since (13) is a convex problem (its objective and constraints are
linear in the variables), the duality gap is zero by Slater's
condition [20]. The dual problem is given by



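$$
\begin{aligned}
\text{maximize} \quad & \sum_{f \in \mathcal{F}} \big( w_f \mu_f + \nu_f (\mu_f - \lambda_f q_f) \big) \\
\text{subject to} \quad & \mu_f \le \sum_{s,a} c_f(s_f) \Pr(s \mid a) \Pr(a), \quad \forall f \in \mathcal{F}, \\
& \Pr(s \mid a) \ge 0, \ \forall s \in \mathcal{S}, \quad \sum_{s \in \mathcal{S}} \Pr(s \mid a) \le 1, \ \forall a, \\
\text{variables} \quad & \{\mu_f, \Pr(s \mid a)\},
\end{aligned}
\tag{14}
$$

where $\nu_f$ is the Lagrange multiplier for flow $f$. The objective
function of (14) is linear and the upper bounds for $\mu_f$ are
affine functions. Therefore, the optimization problem (14) can
be rewritten as:

$$\max_{s \in \mathcal{S}} \sum_{f \in \mathcal{F}} (w_f + \nu_f) c_f(s_f). \tag{15}$$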



$^6$ The problem (13) can be generalized to the case with congestion control
by treating the weights as virtual queues for flow rates (similar to the service
deficit queues that we use later in Section IV-B).



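Thus, we have the following gradient-based iterative algorithm
to find the solution to the dual problem (14):

$$
\begin{aligned}
s^*(k) &\in \arg\max_{s \in \mathcal{S}} \sum_{f \in \mathcal{F}} (w_f + \nu_f(k)) c_f(s_f), \\
\mu_f^*(k) &= c_f(s_f^*(k)), \\
\nu_f(k+1) &= \max\big(0, \nu_f(k) + \rho(\lambda_f q_f - \mu_f^*(k))\big),
\end{aligned}
\tag{16}
$$

where $k$ is the step index, $\rho > 0$ is a fixed step-size parameter,
and $c_f(s_f^*(k))$ is the expected service rate for flow $f$ under
schedule $s_f^*(k)$. Letting $\hat{\nu}_f(k) = \nu_f(k)/\rho$, (16) is rewritten as

$$
\begin{aligned}
s^*(k) &\in \arg\max_{s \in \mathcal{S}} \sum_{f \in \mathcal{F}} \Big(\frac{w_f}{\rho} + \hat{\nu}_f(k)\Big) c_f(s_f), \\
\hat{\nu}_f(k+1) &= \max\big(0, \hat{\nu}_f(k) + (\lambda_f q_f - \mu_f^*(k))\big).
\end{aligned}
\tag{17}
$$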

Remarks: The update equation for $\hat{\nu}_f$ can be interpreted
as a virtual queue for the long-term delivery ratio with the
arrival rate $\lambda_f q_f$ and the service rate $\mu_f^*(k)$, which keeps track
of the deficit in service for flow $f$ to achieve a delivery ratio
greater than or equal to $q_f$. Note that (17) provides only the
static solution to (14). Next, we provide an online scheduling
algorithm that takes into account the dynamic arrivals of the
flows.
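
The iteration (16) can be sketched in a few lines. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: the function name `dual_iteration` and the table `c` (with `c[f][n]` standing in for $c_f(n)$, which the paper obtains from the MBIA) are hypothetical, and the feasible set is enumerated by brute force, which is practical only for a small frame length and few flows.

```python
from itertools import product


def dual_iteration(T, weights, lam_q, c, rho=0.1, steps=100):
    """Run the dual update (16) for `steps` iterations.

    T       -- frame length in slots
    weights -- w_f for each flow
    lam_q   -- lambda_f * q_f for each flow
    c       -- hypothetical table: c[f][n] = expected deliveries of flow f in n slots
    """
    flows = range(len(weights))
    # Feasible schedules: nonnegative slot counts whose sum does not exceed T.
    schedules = [s for s in product(range(T + 1), repeat=len(weights))
                 if sum(s) <= T]
    nu = [0.0 for _ in flows]
    s_star = schedules[0]
    for _ in range(steps):
        # Max-weight schedule under the current multipliers.
        s_star = max(
            schedules,
            key=lambda s: sum((weights[f] + nu[f]) * c[f][s[f]] for f in flows),
        )
        mu = [c[f][s_star[f]] for f in flows]
        # Projected step on each multiplier, as in the last line of (16).
        nu = [max(0.0, nu[f] + rho * (lam_q[f] - mu[f])) for f in flows]
    return s_star, nu
```

When the delivery requirements are slack (for example `lam_q` all zero), the multipliers stay at zero and the rule reduces to plain weighted-throughput maximization.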

C. Online Scheduling Algorithm 

The online scheduling algorithm is given by

$$
\begin{aligned}
s^*(k) &\in \arg\max_{s \in \mathcal{S}} \sum_{f \in \mathcal{F}} \Big(\frac{w_f}{\rho} + \hat{\nu}_f(k)\Big) c_f(s_f), \\
\hat{\nu}_f(k+1) &= \max\big(0, \hat{\nu}_f(k) + \hat{a}_f(k) - \hat{c}_f(s_f^*(k))\big),
\end{aligned}
\tag{18}
$$

where $\hat{c}_f(s_f^*(k))$ denotes the actual delivered number of
packets under the schedule $s_f^*(k)$ depending on the channel
realizations, and $\hat{a}_f(k)$ is a binomial random variable with
parameters $a_f(k)$, the number of packet arrivals of flow $f$
in the $k$th frame, and $q_f$. This implementation for $\hat{a}_f(k)$ was
proposed in [21]. At the beginning of each period, the schedule
$s^*(k)$ is determined by (18). Then, the packets of each flow $f$
are transmitted with the MBIA in the scheduled $s_f^*(k)$ slots.
The virtual queue $\hat{\nu}_f$ is updated based on the number of
successfully delivered packets $\hat{c}_f(s_f^*(k))$ of each flow $f$. With
Lyapunov optimization techniques [22], [23], it can be shown
that (18) has the following properties.

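Theorem 4.5. Consider the Lyapunov function $L(\hat{\nu}) =
\frac{1}{2} \sum_{f \in \mathcal{F}} \hat{\nu}_f^2$. If $\mu_f^* > \lambda_f q_f$ for all $f \in \mathcal{F}$, then the expected
service deficit $\hat{\nu}_f$ is upper-bounded by

$$\limsup_{k \to \infty} E\Big[\sum_{f \in \mathcal{F}} \hat{\nu}_f(k)\Big] \le B_1 + \frac{1}{\rho} B_2,$$

for some positive constants $B_1$ and $B_2$. Furthermore, the
online algorithm can achieve the long-term delivery ratio
requirements, i.e., for all $f \in \mathcal{F}$ we have

$$\liminf_{K \to \infty} E\Big[\frac{1}{K} \sum_{k=1}^{K} \hat{c}_f(s_f^*(k))\Big] \ge \lambda_f q_f.$$

Theorem 4.6. Let $\rho > 0$ and $\mu^*$ be the solution to (17). If
$\mu_f^* > \lambda_f q_f$ for all $f \in \mathcal{F}$, it follows for $B > 0$ that

$$\limsup_{K \to \infty} E\Big[\sum_{f \in \mathcal{F}} \Big(w_f \mu_f^* - \frac{w_f}{K} \sum_{k=1}^{K} \hat{c}_f(s_f^*(k))\Big)\Big] \le B\rho.$$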

The proofs follow from the optimization framework in [22],
[23] and they are similar to the proofs presented in [21]. Note
that the online scheduling algorithm (18) can approach within
$O(\rho)$ of the optimal solution to (14) and does not require any
knowledge of the packet arrival statistics.
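
One frame of the online rule (18) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: `online_step`, the table `c` (standing in for the MBIA values $c_f(\cdot)$), and the callback `deliver` (standing in for the channel-dependent number of delivered packets) are hypothetical names introduced here.

```python
import random
from itertools import product


def online_step(nu, arrivals, q, weights, rho, c, T, deliver, rng):
    """Apply (18) for one frame; return (schedule, updated deficit queues).

    nu       -- current deficits, one per flow
    arrivals -- a_f(k), packets arriving in this frame
    q        -- delivery ratio requirements q_f
    deliver  -- hypothetical callback (flow, slots) -> packets actually delivered
    rng      -- a random.Random instance
    """
    flows = range(len(nu))
    schedules = [s for s in product(range(T + 1), repeat=len(nu))
                 if sum(s) <= T]
    s_star = max(
        schedules,
        key=lambda s: sum((weights[f] / rho + nu[f]) * c[f][s[f]] for f in flows),
    )
    new_nu = []
    for f in flows:
        # Binomial(a_f(k), q_f): keep each arrival with probability q_f.
        a_hat = sum(rng.random() < q[f] for _ in range(arrivals[f]))
        served = deliver(f, s_star[f])
        new_nu.append(max(0.0, nu[f] + a_hat - served))
    return s_star, new_nu
```

Note that the update uses only the realized arrivals and deliveries of the current frame, which is why no arrival statistics are needed.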

D. Performance Evaluation 

We consider a network with two flows, each with five
receivers. The packet traffic of each flow follows a Bernoulli
distribution with mean $\lambda_f$ packets/frame for $f = 1, 2$, and the
length of each frame is 10 slots. In the simulation, we set
$\lambda_f = \lambda$ for $f = 1, 2$. The channel erasure probability $\epsilon$ is 0.3,
the weights $w_f$ are 1 for all flows, the step-size $\rho$ is 0.1, and
the simulation time is $10^5$ frames.

We evaluate the performance of our algorithm by comparing
the region of achievable rates $(\mu_1, \mu_2)$ with that of plain
retransmission under different traffic flow rates $\lambda$, where the
achievable rates denote the feasible solution to (13) for given
delivery ratio requirements $q_f$. By varying $q_f$, we find the
achievable rate region. As illustrated in Fig. 8, plain
retransmission achieves only a small fraction of the region
achieved with adaptive NC. By using adaptive NC, the network can
support flows with heavier traffic.

Fig. 9 shows the average service deficit $\hat{\nu}$ of two flows. The
delivery ratio requirement of each flow is 0.8. As $\lambda$ increases,
$\hat{\nu}$ grows unbounded, which means that the conditions $\mu_f^* >
\lambda_f q_f$ for all $f \in \mathcal{F}$ are not satisfied, i.e., the arrival rates
are not in the "stability" region, and the online scheduling
algorithm cannot meet the delivery ratio requirements.

V. High Fidelity Wireless Testing with Hardware 
Implementation 

We tested the adaptive NC schemes in a realistic wireless 
emulation environment with real radio transmissions. As il- 
lustrated in Fig. 10, our testbed platform consists of four 
main components: radio frequency network emulator simulator 
tool, RFnest™ [24] (developed and owned as a trademark 
by Intelligent Automation, Inc.), software simulator running 
higher-layer protocols on a PC host, configurable RF front- 
ends (RouterStation Pro from Ubiquiti), and digital switch. 
We removed the radio antennas and connected the radios
with RF cables over an attenuator box. Real signals
are then sent over emulated channels, so that actual physical-layer
interactions occur between the radios, while the
physical channel attenuation is digitally controlled according
to the simulation model or to replayed recordings of field test
scenarios.

In the hardware experiments, we executed wireless tests
on the 2.462 GHz channel with 10 dBm transmission power and
1 Mbps rate. We used CORE (Common Open Research Emulator)
[25] to manage the scenario being tested. We changed
the locations of receivers through the RFnest™ GUI and let the
signal power decay as $d^{-\alpha}$ over distance $d$ with path loss
coefficient $\alpha = 4$. By using real radio transmissions according
to this model, we varied the attenuation from the transmitter to
each of the receivers and generated different channel erasure
probabilities. With RFnest™, we replayed the same wireless
traces for each of the NC algorithms and compared them under
the high fidelity network emulation with hardware-in-the-loop
experiments.

Fig. 8: Achievable rate regions under adaptive NC and plain
retransmission policies.

Fig. 11 illustrates the performance of the optimal policy, 
the greedy policy and the fixed block size policy suggested by 
[16]. The experimental results show that the greedy policy 
performs close to the optimal policy in practice. Both the 
greedy and the optimal policies outperform the fixed block size 
policy, and the complexity remains low with the polynomial- 
time algorithm MBIA. Fig. 12 illustrates the wireless test 
performance for the case when the unknown channel erasure 
probabilities are learned over time. Algorithm 2 performs close
to the optimal policy in this case and converges within several frames.

Fig. 10: Programmable RFnest™ testbed.

VI. Conclusion

We considered adaptive NC for multimedia traffic with hard
deadlines and formulated the sequential block size adaptation
problem as a Markov decision process for a single-hop
wireless network. By exploring the structural properties of
the problem, we derived a polynomial-time algorithm, MBIA,
to solve the optimal NC block size adaptation problem and
developed a joint real-time scheduling and channel learning
scheme that can adapt to wireless channel dynamics when
perfect channel information is not available at the scheduler.
Then, we generalized the study to multiple flows with hard
deadlines and long-term delivery constraints, and developed
a low-complexity online scheduling algorithm integrated with
the MBIA. Finally, we performed high fidelity wireless emulation
tests with real radios to demonstrate the feasibility of the
MBIA in finding the optimal block size in real time. Future
work should extend the model to integrate congestion control
with adaptive NC and real-time scheduling under deadline
constraints.
Appendix

A. Proof of Lemma 2.1 

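$P(K,T)$ is monotonically decreasing with $K$ if $\tilde{P}(K,T)$
is monotonically decreasing with $K$, where

$$\tilde{P}(K,T) = \sum_{\tau=K}^{T} \binom{\tau-1}{K-1} \epsilon^{\tau-K} (1-\epsilon)^{K} \tag{19}$$

such that $P(K,T) = (\tilde{P}(K,T))^N$. First, for $K \ge 2$, we express

$$
\begin{aligned}
\tilde{P}(K,T) - \epsilon \tilde{P}(K,T)
&= (1-\epsilon)^K + \sum_{\tau=K+1}^{T} \left[\binom{\tau-1}{K-1} - \binom{\tau-2}{K-1}\right] \epsilon^{\tau-K} (1-\epsilon)^K - \binom{T-1}{K-1} \epsilon^{T-(K-1)} (1-\epsilon)^K \\
&= (1-\epsilon)^K + \sum_{\tau=K+1}^{T} \binom{\tau-2}{K-2} \epsilon^{\tau-K} (1-\epsilon)^K - \binom{T-1}{K-1} \epsilon^{T-(K-1)} (1-\epsilon)^K \\
&= (1-\epsilon) \sum_{\tau=K-1}^{T-1} \binom{\tau-1}{K-2} \epsilon^{\tau-(K-1)} (1-\epsilon)^{K-1} - \binom{T-1}{K-1} \epsilon^{T-(K-1)} (1-\epsilon)^K \\
&= (1-\epsilon) \left[\tilde{P}(K-1,T) - \binom{T-1}{K-2} \epsilon^{T-(K-1)} (1-\epsilon)^{K-1}\right] - \binom{T-1}{K-1} \epsilon^{T-(K-1)} (1-\epsilon)^K \\
&= (1-\epsilon) \tilde{P}(K-1,T) - \binom{T}{K-1} \epsilon^{T-(K-1)} (1-\epsilon)^K.
\end{aligned}
\tag{20}
$$

Since from (20) it follows that

$$\tilde{P}(K,T) - \tilde{P}(K-1,T) = -\binom{T}{K-1} \epsilon^{T-(K-1)} (1-\epsilon)^{K-1} \tag{21}$$

is negative, $\tilde{P}(K,T)$ is monotonically decreasing with $K$.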
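
The identity (21) is easy to check numerically. The following sketch evaluates (19) directly and compares both sides of (21); the function names are hypothetical and chosen here for illustration only.

```python
from math import comb


def p_single(K, T, eps):
    """Per-receiver probability from (19): the K-th erasure-free slot occurs by slot T."""
    return sum(
        comb(tau - 1, K - 1) * eps ** (tau - K) * (1 - eps) ** K
        for tau in range(K, T + 1)
    )


def p_all(K, T, eps, N):
    """Probability that all N receivers decode a block of K packets within T slots."""
    return p_single(K, T, eps) ** N


def check_identity_21(T, eps, tol=1e-12):
    """Return True if (21) holds for every K in 2..T up to the tolerance."""
    for K in range(2, T + 1):
        lhs = p_single(K, T, eps) - p_single(K - 1, T, eps)
        rhs = -comb(T, K - 1) * eps ** (T - K + 1) * (1 - eps) ** (K - 1)
        if abs(lhs - rhs) > tol:
            return False
    return True
```

The right-hand side of (21) is the negative of the probability of exactly $K-1$ erasure-free slots among $T$, which is what the difference of the two tail probabilities should equal.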

B. Proof of Proposition 3.1 

1) To show that $R_t(K_t)$ is unimodal, it suffices to show
that $R_t(K_t)$ is log-concave, i.e., $\tilde{R}(K) = \frac{1}{N}\log(K) +
\log(\tilde{P}(K,T))$ is concave. Since $\frac{1}{N}\log(K)$ is concave, it suffices
to show that for any given $T$, $\log(\tilde{P}(K,T))$ is concave,
i.e., $\tilde{P}(K,T)$ is log-concave. Based on the definition of
log-concavity, in what follows, we will show that

$$\tilde{P}(K,T)^2 \ge \tilde{P}(K-1,T)\,\tilde{P}(K+1,T). \tag{22}$$

Based on (21), (22) can be rewritten as

$$\tilde{P}(K-1,T)\,\frac{T-K+1}{K}\,\frac{1-\epsilon}{\epsilon} - \tilde{P}(K,T) \ge 0. \tag{23}$$

We use induction to show (23). For $T = 1, 2$, it is easy to
see that (23) holds. For $T = 3$, we can verify (23) by using
(19). Assume that for $T = t \ge 3$, (23) holds. For $T = t + 1$,
after some algebra, we have

$$
\begin{aligned}
&\tilde{P}(K-1,t+1)\,\frac{t-K+2}{K}\,\frac{1-\epsilon}{\epsilon} - \tilde{P}(K,t+1) \\
&\quad = \tilde{P}(K-1,t)\,\frac{t-K+1}{K}\,\frac{1-\epsilon}{\epsilon} - \tilde{P}(K,t) \\
&\qquad + \frac{1}{K}\left(\tilde{P}(K-1,t)\,\frac{1-\epsilon}{\epsilon} - \binom{t}{K-1}\epsilon^{t-(K-1)}(1-\epsilon)^{K}\right) \ge 0,
\end{aligned}
\tag{24}
$$

which is based on the induction hypothesis and (21).
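
As a sanity check on the induction, conditions (22) and (23) can be evaluated numerically. The sketch below uses the equivalent binomial-tail form of (19), i.e., the probability of at least $K$ erasure-free slots among $T$; the function names are hypothetical.

```python
from math import comb


def p_tail(K, T, eps):
    """P(at least K erasure-free slots among T); equals the sum in (19) for K >= 1."""
    return sum(
        comb(T, j) * (1 - eps) ** j * eps ** (T - j) for j in range(K, T + 1)
    )


def log_concave_gap(K, T, eps):
    """Left side minus right side of (22); nonnegative when (22) holds."""
    return p_tail(K, T, eps) ** 2 - p_tail(K - 1, T, eps) * p_tail(K + 1, T, eps)


def condition_23(K, T, eps):
    """Left side of (23); nonnegative when (23) holds."""
    return (p_tail(K - 1, T, eps) * (T - K + 1) / K * (1 - eps) / eps
            - p_tail(K, T, eps))
```

Scanning a grid of $(K, T, \epsilon)$ values with these two functions should give nonnegative results, up to floating-point rounding.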

2) Since $P(\hat{K}_t, t)$ is monotonically increasing with $t$ for
any $\hat{K}_t$, $R_t(\hat{K}_t)$ is monotonically increasing with $t$. Besides,
as $t$ goes to infinity, $\lim_{t \to \infty} R_t(\hat{K}_t) = \hat{K}_t$, i.e., the block with
length $\hat{K}_t$ can be delivered almost surely. Therefore, for any
$K < \hat{K}_t$, we can conclude that $R_{t+1}(\hat{K}_t) > R_{t+1}(K)$.
Since $\hat{K}_{t+1}$ is the optimal block size in slot $t + 1$, i.e.,
$R_{t+1}(\hat{K}_{t+1}) \ge R_{t+1}(\hat{K}_t)$, if $\hat{K}_{t+1} < \hat{K}_t$, then we have
$R_{t+1}(\hat{K}_{t+1}) < R_{t+1}(\hat{K}_t)$, which contradicts the fact that
$\hat{K}_{t+1}$ is the optimal block size in slot $t + 1$. Therefore,
$\hat{K}_{t+1} \ge \hat{K}_t$.
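
The monotonicity of the greedy block size can be illustrated numerically. The sketch below assumes the immediate reward has the form $R_t(K) = K \cdot P(K,t)$ with $t$ the remaining number of slots, which matches the expressions for $R_t(1)$ and $R_t(2)$ used in the proof of Theorem 3.4; the function names and the parameter values in the usage note are illustrative choices, not taken from the paper's experiments.

```python
from math import comb


def reward(K, t, eps, N):
    """Assumed immediate reward R_t(K) = K * P(K, t), with N receivers."""
    tail = sum(
        comb(t, j) * (1 - eps) ** j * eps ** (t - j) for j in range(K, t + 1)
    )
    return K * tail ** N


def greedy_block_size(t, eps, N):
    """Block size in 1..t that maximizes the immediate reward."""
    return max(range(1, t + 1), key=lambda K: reward(K, t, eps, N))


def greedy_sizes(t_max, eps, N):
    """Greedy block sizes for remaining times 1..t_max."""
    return [greedy_block_size(t, eps, N) for t in range(1, t_max + 1)]
```

For instance, `greedy_sizes(20, 0.3, 5)` produces a non-decreasing sequence that starts at block size 1 when only one slot remains.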

C. Proof of Theorem 3.1 

The proof follows from a contradiction argument. Suppose
that $K_t^* > K_{t+1}^*$. It can be shown that $K_t^* \le \hat{K}_t$ by a
contradiction argument. From Proposition 3.1, it follows that
$R_t(K_t^*) > R_t(K_{t+1}^*)$ in slot $t$. Since $K_t^*$ is the optimal action
in slot $t$, $V_t(K_t^*; V^*) \ge V_t(K_{t+1}^*; V^*)$. Since $K_t^* > K_{t+1}^*$ in
slot $t$, when the optimal policy is applied, the future reward
$J_t(K_t^*)$ under $K_t^*$ is less than the future reward $J_t(K_{t+1}^*)$
under $K_{t+1}^*$, because less time remains under $K_t^*$. The
future rewards under both actions are monotonically increasing
with $t$, since $R_t(K)$ is monotonically increasing with $t$ for
given $K$. Moreover, $\lim_{t \to \infty} (J_t(K_t^*) - J_t(K_{t+1}^*)) = 0$, since the
probability of successfully delivering any given set of packets
under any policy goes to 1 when the remaining time goes
to infinity. This indicates that the gap between these future
rewards decreases in slot $t + 1$. Since $R_t(K)$ is monotonically
increasing with $t$, we have $R_{t+1}(K_t^*) > R_{t+1}(K_{t+1}^*)$. Therefore,
$R_{t+1}(K_t^*) + J_{t+1}(K_t^*) > R_{t+1}(K_{t+1}^*) + J_{t+1}(K_{t+1}^*)$,
i.e., in slot $t + 1$, the total expected reward under $K_{t+1}^*$ is
less than that under $K_t^*$, which contradicts that $K_{t+1}^*$ is the
optimal action in slot $t + 1$.

D. Proof of Theorem 3.2 

The proof follows from a contradiction argument. Suppose
that $K_t^* > \hat{K}_t$. For any sample path, the case with $\hat{K}_t$ will
deliver the block earlier than the case with $K_t^*$. For the sample
paths in which the number of slots where all the channels between the
transmitter and the receivers are good is less than $K_t^*$, the reward
under $\hat{K}_t$ is higher than that under $K_t^*$. For the other sample
paths, in which the number of slots where all the channels between
the transmitter and the receivers are good is greater than $K_t^*$, the
block with size $\hat{K}_t$ will be delivered earlier than that with size
$K_t^*$. We assume that after the block with size $\hat{K}_t$ is delivered,
the scheduler chooses to deliver blocks of size 1 until
the block with size $K_t^*$ is delivered. Then, after the block with
size $K_t^*$ is delivered, the optimal policy is applied for both
cases. Obviously, in this case, both cases will generate the
same reward. However, for the case with block size $\hat{K}_t$, the
policy that we applied after the block with size $\hat{K}_t$ is delivered
may not be optimal, which means that the reward under the
optimal policy is no less than the reward of the policy we
used. Therefore, the total expected reward under $K_t^*$ is less
than that under $\hat{K}_t$, which contradicts the fact that $K_t^*$ is the
optimal action in slot $t$.



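E. Proof of Corollary 3.1

From Theorem 3.2, we have $K_j^* \le \hat{K}_j$. Therefore, it suffices
to show that $\hat{K}_j \le \bar{K}_t$ for any $j \in \{1, \ldots, t\}$. From Proposition
3.1, $\bar{K}_t$ is in the decreasing sequence of $R_t(\cdot)$ when $R_t(\bar{K}_t) >
R_t(\bar{K}_t + 1)$. Therefore, it follows that $\hat{K}_t \le \bar{K}_t$, and, since
$\hat{K}_j$ is non-decreasing in $j$ (Proposition 3.1), $\hat{K}_j \le \hat{K}_t \le \bar{K}_t$.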



F. Proof of Theorem 3.4 

When $R_t(1) > R_t(2)$ holds, $\hat{K}(t) = 1$, due to the unimodal
property of $R_t(\cdot)$. Then, $K^*(t) = 1$ from Theorem 3.2. Since
$K^*(t)$ is non-decreasing with $t$ (Theorem 3.1), $K^*(t') = 1$ in
the remaining slots $t' < t$, i.e., the plain retransmission policy
is optimal. To show that there exists a threshold $\epsilon^*$, we expand
$R_t(1) > R_t(2)$ according to (2), where $R_t(1) = (1 - \epsilon^T)^N$ and
$R_t(2) = 2(1 - \epsilon^T + T\epsilon^T - T\epsilon^{T-1})^N$. Then, the monotonicity of
$\epsilon^*$ follows from comparing $R_t(1)$ with $R_t(2)$ in the expanded
form. Define $f(\epsilon,t,N) = (1 - \epsilon^t) - 2^{1/N}(1 - \epsilon^t + t\epsilon^t - t\epsilon^{t-1})$
such that $f(\epsilon^*(t,N),t,N) = 0$. Note that $f(0,t,N) =
1 - 2^{1/N} < 0$ and $f(1,t,N) = 0$. There exists a unique
non-trivial value $\epsilon'$ in $(0,1)$ that maximizes $f(\epsilon,t,N)$:
$f(\epsilon,t,N)$ is increasing for $\epsilon < \epsilon'$ and then decreases
back to 0. Therefore, there exists a unique non-trivial solution
of $f(\epsilon^*(t,N),t,N) = 0$ such that $f(\epsilon,t,N) < 0$ for
$\epsilon < \epsilon^*(t,N)$ and $f(\epsilon,t,N) > 0$ for $\epsilon > \epsilon^*(t,N)$.

G. Proof of Corollary 3.2 

If $N_2 > N_1$, $f(\epsilon,t,N_2) > f(\epsilon,t,N_1)$. For any $N$, $f(\epsilon,t,N)$
increases with $\epsilon$, achieves a positive maximum and decreases
back to zero. Since $f(\epsilon,t,N_2) > f(\epsilon,t,N_1)$, the value
$\epsilon^*(t,N_i)$, $i = 1, 2$, for which $f(\epsilon^*,t,N_i) = 0$ decreases from
$\epsilon^*(t,N_1)$ to $\epsilon^*(t,N_2)$. By following similar arguments, it
follows that $\epsilon^*(t,N)$ is monotonically increasing with $t$.
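
The threshold $\epsilon^*(t,N)$ from the proof of Theorem 3.4 has no closed form in general, but it can be located by bisection because $f$ changes sign exactly once on $(0,1)$. The sketch below is an illustration with hypothetical function names; it assumes $t \ge 2$ and stops the search interval slightly short of 1 to avoid the trivial root at $\epsilon = 1$.

```python
def f(eps, t, N):
    """f(eps, t, N) as defined in the proof of Theorem 3.4."""
    g = 1 - eps ** t + t * eps ** t - t * eps ** (t - 1)
    return (1 - eps ** t) - 2 ** (1 / N) * g


def erasure_threshold(t, N, iters=60):
    """Bisection for the non-trivial root of f in (0, 1); requires t >= 2."""
    if t < 2:
        raise ValueError("t must be at least 2")
    lo, hi = 0.0, 1.0 - 1e-6  # f(lo) < 0 and f(hi) > 0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid, t, N) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For $t = 2$ and $N = 1$, $f$ factors as $(1-\epsilon)(3\epsilon - 1)$, so the routine should return a value close to $1/3$. Evaluating it for larger $N$ or larger $t$ gives a quick numerical view of the trends stated in Corollary 3.2.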